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Abstract: This paper presents a novel class of self-organizing sensing agents that adaptively 
learn an anisotropic, spatio-temporal Gaussian process using noisy measurements and move 
in order to improve the quality of the estimated covariance function. This approach is based 
on a class of anisotropic covariance functions of Gaussian processes introduced to model a 
broad range of spatio-temporal physical phenomena. The covariance function is assumed 
to be unknown a priori. Hence, it is estimated by the maximum a posteriori probability 
(MAP) estimator. The prediction of the field of interest is then obtained based on the MAP 
estimate of the covariance function. An optimal sampling strategy is proposed to minimize 
the information-theoretic cost function of the Fisher Information Matrix. Simulation results 
demonstrate the effectiveness and the adaptability of the proposed scheme. 

Keywords: mobile sensor networks; Gaussian processes; adaptive sampling 



1. Introduction 

In recent years, due to global climate changes, more environmental scientists are interested in 
the change of ecosystems over vast regions in lands, oceans, and lakes. For instance, for certain 
environmental conditions, rapidly reproducing harmful algal blooms in the Great Lakes can produce 
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cyanotoxins. Besides such natural disasters, there exist growing ubiquitous possibilities of the release of 
toxic chemicals and contaminants in the air, lakes, and public water systems. This resulted in the rising 
demands to utilize autonomous robotic systems that can perform a series of tasks such as estimation, 
prediction, monitoring, tracking and removal of a scalar field of interest undergoing often complex 
transport phenomena (Common examples are diffusion, convection, and advection). 

Significant enhancements have been made in the areas of mobile sensor networks and mobile sensing 
vehicles such as unmanned ground vehicles, autonomous underwater vehicles, and unmanned aerial 
vehicles. Emerging technologies have been reported on the coordination of mobile sensing agents [1-6]. 
Mobile sensing agents form an ad-hoc wireless communication network in which each agent usually 
operates under a short communication range, with limited memory and computational power. Mobile 
sensing agents are often spatially distributed in an uncertain surveillance environment. 

The mobility of mobile agents can be designed in order to perform the optimal sampling of the 
field of interest. Recently in [5], Leonard et al. developed mobile sensor networks that optimize 
ocean sampling performance defined in terms of uncertainty in a model estimate of a sampled field. 
In [6], distributed learning and cooperative control were developed for multi-agent systems to discover 
peaks of the unknown field based on the recursive estimation of an unknown field. In general, we 
design the mobility of sensing agents to find the most informative locations to make observations for a 
spatio-temporal phenomenon. To find these locations that predict the phenomena best, one needs a model 
of the spatio-temporal phenomenon itself. In our approach, we focus on Gaussian processes to model 
fields undergoing transport phenomena. A Gaussian process (or kriging in geostatistics) has been widely 
used as a nonlinear regression technique to estimate and predict geostatistical data [7-11]. A Gaussian 
process is a natural generalization of the Gaussian probability distribution. It generalizes the Gaussian 
distribution with a finite number of random variables to a Gaussian process with an infinite number of 
random variables in the surveillance region. Gaussian process modeling enables us to predict physical 
values, such as temperature and plume concentration, at any of spatial points with a predicted uncertainty 
level efficiently. For instance, near-optimal static sensor placements with a mutual information criterion 
in Gaussian processes were proposed by [12,13]. Distributed kriged Kalman filter for spatial estimation 
based on mobile sensor networks are developed by [14]. A distributed adaptive sampling approach 
was proposed by [15] for sensor networks to find locations that maximize the information contents by 
assuming that the covariance function is known up to a scaling parameter. Multi-agent systems that 
are versatile for various tasks by exploiting predictive posterior statistics of Gaussian processes were 
developed by [16,17]. 

The motivation of our work is as follows. Even though there have been efforts to utilize Gaussian 
processes to model and predict the spatio-temporal field of interest, most of recent papers assume that 
Gaussian processes are isotropic, implying that the covariance function only depends on the distance 
between locations. Many studies also assume that the corresponding covariance functions are known a 
priori for simplicity. However, this is not the case in general as pointed out in literature [12,13,18], in 
which they treat the non- stationary process by fusing a collection of isotropic spatial Gaussian processes 
associated with a set of local regions. Hence our motivation is to develop theoretically-sound algorithms 
for mobile sensor networks to learn the anisotropic covariance function of a spatio-temporal Gaussian 
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process. Mobile sensing agents can then predict the Gaussian process based on the estimated covariance 
function in a nonparametric manner. 

The contribution of this paper is to develop covariance function learning algorithms for the sensing 
agents to perform nonparametric prediction based on a properly adapted Gaussian process for a given 
spatio-temporal phenomenon. By introducing a generalized covariance function, we expand the class of 
Gaussian processes to include the anisotropic spatio-temporal phenomena. The maximum a posteriori 
probability (MAP) estimator is used to find hyperparameters for the associated covariance function. The 
proposed optimal navigation strategy for autonomous vehicles minimizes the information-theoretic cost 
function such as D-or A-optimality criterion using the Fisher Information Matrix (or Cramer-Rao lower 
bound (CRLB)[19], improving the quality of the estimated covariance function. A Gaussian process with 
a time-varying covariance function has been proposed to demonstrate the adaptability of the proposed 
scheme. 

This paper is organized as follows. In Section 2, we briefly review the mobile sensing network model 
and the notation related to a graph. A nonparametric approach to predict a field of interest based on 
measurements is presented in Section 3. Section 4 introduces a covariance function learning algorithm 
for an anisotropic, spatio-temporal Gaussian process. An optimal navigation strategy is described in 
Section 5. In Section 6, simulation results illustrate the usefulness of our approach and its adaptability 
for unknown and/or time- varying covariance functions. 

The standard notation will be used in the paper. Let K, M>o, Z denote, respectively, the set of real, 
non-negative real, and integer numbers. The positive semi-definiteness of a matrix A is denoted by 
A y 0. Let \B\ denotes the determinant of a matrix B. E denote the expectation operator. 

2. Mobile Sensor Networks 

First, we explain the mobile sensing network and the measurement model used in this paper. Let N s 
be the number of sensing agents distributed over the surveillance region Q G R 2 . Assume that Q is a 
compact set. The identity of each agent is indexed by X :— {1, 2, • • • , N s }. Let g,(t) G Q be the location 
of the i-th sensing agent at time t G R>o. We assume that the measurement y(q i (t),t) of agent i is the 
sum of the scalar value of the Gaussian process z(qi(t),t) and sensor noise Wi(t), at its position q^t) and 
some measurement time t, 

y(q t (t),t) := z(qi(t),t) +Wi(t). 

The communication network of mobile agents can be represented by a graph with edges. Let 
G(t) = (X,£(t)) be an undirected communication graph such that an edge G £(t) if and only 

if agent i can communicate with agent j ^ i. We define the neighborhood of agent i at time t by 
Ni(t) := {j | G £(£), i G X}. We also define the closed neighborhood of agent i at time t by the 
union of its index and its neighbors, i.e., Ni(t) := {i} U Ni(t). 

3. The Nonparametric Approach 

With the spatially distributed sampling capability, agents need to estimate and predict the field of 
interest by fusing the collective samples from different space and time indices. We show a nonparametric 
approach to predict a field of interest based on measurements. We assume that a field undergoing a 
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physical transport phenomenon can be modeled by a spatio-temporal Gaussian process, which can be 
used for nonparametric prediction. 

Consider a spatio-temporal Gaussian process: 



z(s, t) ~ QV (fi{s, t), JC(s, t, s', t')) 



(1) 



where s,s' G Q, t,t' G M> 0 and n(s,t) denotes the mean value at location s and time t. We then 
propose the following generalized covariance function fC(s,t,s',t';ty) with a hyperparameter vector 

* := [ o f o x o y o t ] T : 



JC(s, t, s', t; \&) = (Tj exp 




exp 



f\2 



it ~ g ) 

2<rf 



(2) 



where is the Z-th entry of s. {cr x , <r y } and 07 are kernel bandwidths for space and time, respectively. 
Equation (2) shows that points close in the measurement space and time indices are strongly correlated 
and produce similar values. In reality, the larger temporal distance two measurements are taken 
with, the less correlated they become, which strongly supports our generalized covariance function in 
Equation (2). This may also justify the truncation (or windowing) of the observed time series data to 
limit the size of the covariance matrix for reducing the computational cost. 

In the case that the global coordinates are different from the local model coordinates, a similarity 
transformation can be used to address this issue. For instance, a rotational relationship between the 
model basis {e x , e y ] and the global basis {E x , E y ] is: 



E x 
E, 



where 9 represents the angle of rotation. We can then use the following relationship to change the 
coordinates: 

x = X cos 9 + Y sin 9 
y = -X sin 9 + Y cos 9 

where x and y indicate coordinates in the local basis and X and Y indicate their counterparts in the 
global basis. Equation (2) can then be rewritten in terms of global coordinates as 







cos 9 


sin# 


e y 




— sin 9 


cos 9 



JC(s,t, s',t'; = o-^exp ^~ 



• exp 



[(sx — s' x ) cos 9 + (sy — s' Y ) sin 9}' 
2^ ~ 
-(sx — s' x ) sin9 + (sy — s' Y ) cos#]' 
^ 2^ " 



exp 



r\2 



it ~ g ) 
2aj 



where s x and s Y are the coordinates in the global basis. In this case, the parameter vector \I> is redefined 
as \& := [ Of o x o y o t 9 ] T . 

Up to time agent i has noisy collective data {y(qj{t m ), t m ) \ m G Z, j G Ni(t m ), 1 < m < fc}, 
where Ni(t m ) denotes the closed neighborhood of agent i at time t m . The measurements 
y(qj(t m ),t m ) = z(qj(t m ),t m ) + Wj(t m ) are taken at different positions qj(t m ) G Q and different 
times t m G IR>o- The measurements are corrupted by the sensor and communication noises 
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represented by Gaussian white noise Wj ~ A/"(0,er^). For the case in which the noise level a w is 
not known and needs to be estimated, the hyperparameter vector can be expanded to include a w , i.e., 
:= [ <jf o x (j y a t a w ] T . The column- vectorized measurements collected by agent i is denoted 

by 

Y< k := col (y{qj{t m ),t m ) I m E Z, j G N t (t m ), 1 < m < k) 
with a joint distribution 

:= (27r)"/2|s y< Ji/2 ex P (-^( y <*-^*) TE ?< fc ( 1 <*-^<j) 

where n is the total number of observations up to time t k , [iy <k '■= K(Y<k) is the mean 
vector of Y< k , £y <fc := E ((F< fc — yUy <fc )(Y< fe — /Uy<J T ) is the covariance matrix of Y< fc 
obtained by [£y <fc ]jj = /C(sj, ij, Sj, tj) + a^Sij in which 5^ denotes the Kronecker delta function. 

If the covariance function is known a priori, the prediction of the random field z(s, t) at location s 
and time t is then obtained by 

z(s, t\t k ) := z{s, t) | F< fc ~ M (z(s, t\t k ), a 2 (s, t\t k )) (3) 

where z{s, t\t k ) := E (z(s, is 

£(s, t|*k) := fi(s,t) + £*y< fc £y* fc (*<* ~ HY< k ) 
and the prediction error variance is 

a 2 (s,t\t k ) : = S z - E zYik E Y * k E Y < kZ 

where S z is the covariance of z, obtained by JC(s,t,s,t; S 2 y <ft = £y <fc2 is the covariance matrix 
between z and F<fe, obtained by [E 2 y <fc ]j = )C(s,t, Sj,tj] Each agent can then predict the field of 
interest at any location and time with the associated uncertainty in a nonparametric way. In the next 
section, we present a learning approach for unknown covariance functions. 

4. The MAP Estimate of the Hyperparameter Vector 

Without loss of generality, we use a zero mean Gaussian process z(s, t) ~ QV(0, /C(s, t, s', £')) for 
modeling the field undergoing a physical transport phenomenon. This is not a strong limitation since the 
mean of the posterior process is not confined to zero [11]. 

If the covariance function of a Gaussian process is not known a priori, mobile agents need to estimate 
parameters of the covariance function (\&) based on the observed samples. Using Bayes' rule, the 
posterior \ Y< k ) is proportional to the likelihood p(Y< k \^) times the prior i.e., 

P (*\Y< k ) ocp(y< fc |*)p(¥) 

At time t k , the maximum a posteriori (MAP) estimate ^/ k of the hyperparameter vector can be obtained 
by 

^ fc = argmaxp(*|F< fc ) 

* (4) 

= argmaxp(l<fe|$)p(^ r ) 
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This is equivalent to maximize the logarithm of the posterior p(ty\Y< k ), i.e., 

if k = argmax (\np(Y< k \ty) + \np(^/)) 
The log likelihood function is given by 

1 In 

lnp(y< fc |^) = --y< T fe E-^y< fc - -m|E y J - -ln27r 

where n is the size of Y< k . Notice that if no prior information is given, the MAP estimate in 
Equation (4) is equal to the maximum likelihood (ML) estimate. 
A gradient ascent algorithm is used to find a MAP estimate of 

= + 4 ) V^ln P (* ( i ) \Y< k ), *>0 

k 

where ef) is a small positive number which can be obtained by using a backtracking line search and 
V 'xf(x) is the partial derivative of f(x) with respect to x. The partial derivative of the log likelihood 
function with respect to a hyperparameter ipj is given by 

d\n P (Y< k \v) _ i as y t 1 / <9£ y< 

where (3 = Y> Y ^Y< k . Alternatively, a simplex search method [20] can be used to find a MAP estimate 
of ^. This is a direct search method that does not use numerical or analytic gradients. 

After finding a MAP estimate of ^ agents can proceed the prediction of the field of interest using 
Equation (3). 

5. An Adaptive Sampling Strategy 

Agents should find new sampling positions to improve the quality of the estimated covariance function 
in the next iteration at time t k+1 . For instance, to precisely estimate the anisotropic phenomenon, i.e., 
processes with different correlations along a>axis and y-axis directions, sensing agents need to explore 
and sample measurements along different directions. 

To this end, we consider a centralized scheme. Suppose that a leader agent (or a central station) 
knows the communication graph at the next iteration time t k+ i and also has access to all measurements 
collected by agents. Let Y k+ \ and Y< k be the measurements at time t k+ \ and the collective measurements 
up to time t k , respectively, i.e., 

Yfc+i :=col (y(qi(t k+1 ),t k+ i) \ i G Z) , 
Y< k :=col (y(qi{t m ),t m ) \ m G Z, i G X, 1 < m < k) 

To derive the optimal navigation strategy, we compute the log likelihood function of observations of 

Y<k+i'- 

£(Y< k+1 ,*) :=\np(Y< k+1 \V) 

1 r -i 1 . n <k+ i W 

= " 9 y <fc+l E Y< fe+i y <fc+l - 9 ln l E ^< fc+ il - -=7j— ln27r 
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where n< k+ \ is the size of Y< k+1 . 

Since the locations of observations in Y< k were already fixed, we represent the log likelihood function 
in terms of a vector of future sampling points q at time t k+1 only and the hyperparameter vector ^: 

£(g,vD) : =lnp(Y< k+1 (qm 

Now consider the Fisher Information Matrix (FIM) that measures the information produced by 
measurements Y< k+1 for estimating the hyperparameter vector at time t k+1 . The Cramer- Rao lower 
bound (CRLB) theorem states that the inverse of the FIM is a lower bound of the estimation error 
covariance matrix [19,21]: 



E ((* fc+1 - *)(*fc+i " ^) T ) >= FIM 



where ^fc+i represents the estimation of \I/ at time t k+ \. The FIM [19] is given by 

[FIM(g, n .. = — E ( d l C{ ^ ) ) (7) 



dipidip 



where the expectation is taken with respect to p(Y< k+1 \^) . The analytical closed-form of FEVI is given 
by 

[FIM(^)]^. = I tr (V ^f^^y 1 ^f^] 
1 vy ' ni] 2 V - k+1 - k+1 ) 

Since the true value of \P is not available, we will evaluate the FIM in Equation (7) at the currently 
available best estimate ^! k . This has been an effective practical solution when we evaluate the FIM and 
estimate ^ simultaneously [22,23]. The term due to the MAP estimation error in evaluating the FIM in 
Equation (7) will decrease as the number of samples increases. 

We can expect that minimizing the CRLB results in a decrease of uncertainty in estimating \l> [22]. 
Using the D-optimality criterion [24,25], the objective function J is given by 

J(q,^ k ) :=det(FIM-\q,y k )) (8) 



Minimizing J in Equation (8) corresponds to minimizing the volume of the ellipsoid which represents 
the maximum confidence region for the estimated hyperparameters. However, if one hyperparameter has 
a much larger variance compared to the others, minimizing the volume may not be very useful [25]. As 
an alternative, the A-optimality which minimizes the sum of the variances may be used. The objective 
function J based on the A-optimality criterion is 

J(g,* fc ) ^tr^FIM" 1 ^,^)) (9) 

A control law for the mobile sensor network can be formulated as follows: 

q(t k+1 ) = arg min J (q, (10) 

<?eQ 

A gradient descent strategy can be used to find the next optimal sampling positions: 



q 



qfW_ a (0 V , (0 J(g w ,* fc ) (11) 
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where aS 1 ' is a small positive number which can be obtained by using a backtracking line search. 
Alternatively, a control law for the mobile sensor network can be formulated as follows: 

q(t k+1 ) = &rg min J(q,^ k ) 

where 

3=1 

where n d = 2 denotes the dimension of the surveillance region Q and 5j is the maximum distance for 
each agent to move in x and y directions. 

However, optimization on \iap(Y< k+ i\^/) in Equation (6) and J(q, ^ k ) in Equation (8) or (9) can be 
numerically costly due to the increasing size of Sy <fc . One way to deal with this problem is to use a 
truncated date set 

Y k -s<,<k ■= col (y(sj(t m ),t m ) \ meZ,jeT,k-8<m<k) 

instead of using Y< k . In addition, this approach based on the truncated observations can be viewed 
as a strategy to deal with a slowly time-varying parameter vector \P, which will further investigated in 
Section 6.2. 

The overall protocol for the sensor network is summarized as in Table 1. 

Table 1. An adaptive sampling strategy for mobile sensor networks. 



1. Learning: At time t k , the sensor network updates ^ k using a MAP estimate Equation (4) 
for a data set Y< k . Start this MAP optimization with the initial point ^fc_i. 

2. Prediction: For given Y< k and ty k , agents can compute prediction at any point and time 
using Equation (3), i.e., p(z(s, t)\Y< k ; ty k ). 

3. Sampling: Based on {^ k ,Y< k }, the sensor network computes the control Equation (10) 
in order to maximize J(q,^/ k ). Update the positions of agents accordingly and collect 
measurements at time t k+1 . 

4. Repeat the steps 1-3 until ^ converges. 



6. Simulation Results 

In this section, we evaluate the proposed approach for a spatio-temporal Gaussian process 
(Section 6.1) and an advection-diffusion process (Section 6.3). For both cases, we compare the 
simulation results using the proposed optimal sampling strategy with results using a benchmark random 
sampling strategy. In this random sampling strategy, each agent was initially randomly deployed in the 
surveillance region. At each time step, the next sampling position for agent i is generated randomly 
with the same mobility constraint, viz. a random position within a square region with length 2 centered 
at the current position qi. For fair comparison, the same values were used for all other conditions. In 
Section 6.2, our approach based on truncated observations has been applied to a Gaussian process with 
a time- varying covariance function to demonstrate the adaptability of the proposed scheme. 



Sensors 2011, 11 



3059 



6.1. A Spatio-Temporal Gaussian Process 

We apply our approach to a spatio-temporal Gaussian process. The Gaussian process was numerically 
generated for the simulation [11]. The hyperparameters used in the simulation were chosen such that 
\P = [ Of a x o y a t a w } T = [ 5 4 2 8 0.5 ] T . Two snap shots of the realized Gaussian 
random field at time t = 1 and t = 20 are shown in Figure 1 . In this case, N s = 5 mobile sensing agents 
were initialized at random positions in a surveillance region Q = [ 0 20 ] x [ 0 20 ]. The initial 
values for the algorithm were given to be = [ 1 10 10 1 0.1 ] T . A prior of the hyperparameter 
vector has been selected as 

where p(er/) = p(<J x ) = p(cr y ) = p(&t) = r(5, 2), and p(a w ) = r(5,0.2). T(a,b) is a Gamma 
distribution with mean ab and variance ab 2 in which all possible values are positive. The gradient method 
was used to find the MAP estimate of the hyperparameter vector. 

Figure 1. Snap shots of the realized Gaussian process at (a) t = 1 and (b) t = 20. 




0 5 10 15 20 0 5 10 15 20 



(a) (b) 

For simplicity, we assumed that the global basis is the same as the model basis. We considered a 
situation where at each time, measurements of agents are transmitted to a leader (or a central station) 
that uses our Gaussian learning algorithm and sends optimal control back to individual agents for next 
iteration to improve the quality of the estimated covariance function. The maximum distance for agents 
to move in one time step was chosen to be 1 for both x and y directions. The A-optimality criterion was 
used for optimal sampling. 

For both proposed and random strategies, Monte Carlo simulations were run for 100 times and the 
statistical results are shown in Figure 2. The estimates of the hyperparameters (shown in circles and 
error bars) tend to converge to the true values (shown in dotted lines) for both strategies. As can be 
seen, the proposed scheme (Figure 2(a)) outperforms the random strategy (Figure 2(b)) in terms of the 
A-optimality criterion. 
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Figure 2. Monte Carlo simulation results (100 runs) for a spatio-temporal Gaussian 
process using (a) the random sampling strategy, and (b) the adaptive sampling strategy. 
The estimated hyperparameters are shown in blue circles with error-bars. The true 
hyperparameters that used for generating the process are shown in red dashed lines. 



7 
6 

b 5 
4 
3 






5 — 6 


5— < 


5— ( 




k 


S — C 










>— C 


>— < 


J— < 






r 


) — 1 
















5— ( 


S— « 


(— « 








5— « 


>— « 


5 — O 



2 4 



10 12 14 16 18 20 




10 12 14 16 18 20 



10 























-<> — It- 









2 4 




10 12 14 16 18 20 



"j ^cji ^ JjL ^ _cjL ^ ^_ ^. Jj i 



2 4 



10 12 14 16 18 20 

k 




(a) 



(b) 



Figure 3 shows the predicted field along with agents' trajectories at time t = 1 and t = 20 for one trial. 
As shown in Figure 1(a) and Figure 3(a), at time t — 1, the predicted field is far from the true field due 
to the inaccurate hyperparameters estimation and small number of observations. As time increases, the 
predicted field will be closer to the true field due to the improved quality of the estimated the covariance 
function and the cumulative observations. As expected, at time t = 20, the quality of the predicted field 
is very well near the sampled positions as shown in Figure 3(b). With 100 observations, the running time 
is around 30s using Matlab, R2008a (MathWorks) in a PC (2.4 GHz Dual-Core Processor). No attempt 
has been made to optimize the code. After converging to a good estimate of ^, agents can switch to a 
decentralized configuration and collect samples for other goals such as peak tracking and prediction of 
the process [6,16,17]. 
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1 and (b) t = 20. 
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6.2. Time -Varying Covariance Functions 

To illustrate the adaptability of the proposed strategy to time-varying covariance functions, we 
introduce a Gaussian process defined by the following covariance function. The time-varying covariance 
function is modeled by a time-varying weighted sum of two known covariance functions JCi(-) and /C 2 (-) 
such as 

/C(-) = A(t)/d(-) + (l-A(t))/C 2 (-) (12) 

where A(i) G [0, 1] is a time- varying weight factor that needs to be estimated. In the simulation study, 
/Ci(-) is constructed with Oj — 1, <j x — 0.2, a y = 0.1, a t = 8, and a w = 0.1; and /C 2 (-) is with aj = 1, 
a x = 0.1, (j y = 0.2, a t = 8, and a w = 0.1. This Gaussian process defined in (12) with theses particular 
/Ci and /C 2 effectively models hyperparameter changes in x and y directions. 

To improve the adaptability, the mobile sensor network uses only observations sampled during the last 
20 iterations for estimating A(t) online. The true X(t) and the estimated \(t) are shown in Figure 4(a,b), 
respectively. From Figure 4, it is clear that the weighting factor X(t) can be estimated accurately after 
some delay about 5-8 iterations. The delay is due to using the truncated observations that contain past 
observations since the time-varying covariance function changes continuously in time. 

6.3. Fitting a Gaussian Process to an Advection-Diffusion Process 



We apply our approach to a spatio-temporal process generated by physical phenomena (advection and 
diffusion). This work can be viewed as a statistical modeling of a physical process, i.e., as an effort to fit 
a Gaussian process to a physical advection-diffusion process in practice. The advection-diffusion model 
developed in [26] was used to generate the experimental data numerically. An instantaneous release 
of Qkg of gas occurs at a location (xo, yo, z 0 ). This is then spread by the wind with mean velocity 
u — [ u x 0 0 ] T Assuming that all measurements are recorded at a level z = 0, and the release 
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Figure 4. (a) The weighting factor \{t) and (b) the estimated A(£). 
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Table 2. Parameters used in simulation. 



Parameter 


Notation 


Unit 


Value 


Number of agents 


N s 




5 


Sampling time 


t s 


min 


5 


Initial time 


to 


min 


100 


Gas release mass 


Q 


kg 


10 6 


Wind velocity in x axis 




m/min 


0.5 


Eddy diffusivity in x axis 


K x 


m 2 /min 


20 


Eddy diffusivity in y axis 


Ky 


m 2 /min 


10 


Eddy diffusivity in z axis 


K z 


m 2 /min 


0.2 


Location of explosion 


x 0 


m 


2 


Location of explosion 


yo 


m 


5 


Location of explosion 


z 0 


m 


0 


Sensor noise level 


cr-w 


kg/m 3 


0.1 



occurs at a ground level (i.e., z 0 = 0), the concentration C at an arbitrary location [x, y, 0) and time t is 
described by the following analytical solution [27]: 



C(x,y,0,t) 



Q exp 



(Ax-uAt) 2 
iK x At 



Ay 2 



(13) 



where Ax = x — x 0 , Ay = y — y 0 , and At = 5(t — 1) + to- The parameters used in the simulation 
study are shown in Table 2. Notice that this process generates an anisotropic concentration field with 
parameters K x = 20m 2 /min and K y = 10m 2 /min as in Table 2. The fields at time t — 1 and t — 20 
are shown in Figure 5. Notice the center of the concentration moved. In this case, N s = 5 mobile sensing 
agents were initialized at random positions in a surveillance region Q = [ —50 150 ] x [ —100 100 ]. 
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The initial values for the algorithm was chosen to be \E f< -°^ ) = [ 100 100 100 ] T where we assumed 
Of = 1 and a w = 0.1. For this application, we did not assume any prior knowledge about the covariance 
function. Hence, the MAP estimator was the same as the ML estimator. The gradient method was used 
to find the ML estimate. 

We again assumed that the global basis is the same as the model basis and assumed all agents have 
the same level of measurement noises for simplicity. In our simulation study, agents start sampling at 
to = lOOmin and take measurements at time tk with a sampling time of t s = 5min as in Table 2. 

Monte Carlo simulations were run for 100 times, and Figure 6 shows the estimated a x , a y , and a t 
with (a) the random sampling strategy and (b) the optimal sampling strategy, respectively. With 100 
observations, the running time at each time step is around 20s using Matlab, R2008a (Math Works) in a 
PC (2.4 GHz Dual-Core Processor). No attempt has been made to optimize the code. As can be seen 
in Figure 6, the estimates of the hyperparameters tend to converge to similar values for both strategies. 
Clearly, the proposed strategy outperforms the random sampling strategy in terms of the estimation error 
variance. 

7. Summary 

In this paper, we presented a novel class of self-organizing sensing agents that learn an anisotropic, 
spatio-temporal Gaussian process using noisy measurements and move in order to improve the quality 
of the estimated covariance function. The MAP estimator was used to estimate the hyperparameters in 
the unknown covariance function and the prediction of the field of interest was obtained based on the 
MAP estimates. An optimal navigation strategy was proposed to minimize the information-theoretic 
cost function of the Fisher Information Matrix for the estimated hyperparameters. The proposed scheme 
was applied to both a spatio-temporal Gaussian process and a true advection-diffusion field. Simulation 
study indicated the effectiveness of the proposed scheme and the adaptability to time- varying covariance 
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functions. The trade-off between a precise estimation and computational efficiency using truncated 
observations will be studied in the future work. 

Figure 6. Simulation results (100 runs) for a advection-diffusion process. The estimated 
hyperparameters with (a) random sampling and (b) optimal sampling. 







(a) 



(b) 



Acknowledgment 

This work has been supported by the National Science Foundation through CAREER Award 
CMMI-0846547. This support is gratefully acknowledged. 

References 



1. Lynch, K.M.; Schwartz, I.B.; Yang, R; Freeman, R.A. Decentralized environmental modeling by 
mobile sensor networks. IEEE Trans. Robot. 2008, 24, 710-724. 

2. Tanner, H.G.; Jadbabaie, A.; Pappas, G.J. Stability of Flocking Motion; Technical Report 
MS-CIS-03-03, The GRASP Laboratory, University Pennsylvania, Philadelphia, PA, USA, May 
2003. 

3. Olfati-Saber, R. Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Trans. 
Automat. Contr. 2006, 51, 401^420. 



Sensors 2011, 11 



3065 



4. Ren, W.; Beard, R.W. Consensus seeking in multiagent systems under dynamically changing 
interaction topologies. IEEE Trans. Automat. Contr. 2005, 50, 655-661. 

5. Leonard, N.E.; Paley, D.A.; Lekien, R; Sepulchre, R.; Fratantoni, D.M.; Davis, R.E. Collective 
motion, sensor networks, and ocean sampling. Proc. IEEE 2007, 95, 48-74. 

6. Choi, J.; Oh, S.; Horowitz, R. Distributed learning and cooperative control for multi-agent 
systems. Automatica 2009, 45, 2802-2814. 

7. Cressie, N. Kriging nonstationary data. J. Ante. Statist. Assn. 1986, 81, 625-634. 

8. Cressie, N. Statistics for Spatial Data; A Wiley-Interscience Publication, John Wiley and Sons, 
Inc.: Chichester, West Sussex, UK, 1991. 

9. Gibbs, M.; MacKay, D.J.C. Efficient implementation of Gaussian processes. Available online: 
http://www. cs. toronto. edu/mackay/gpros.ps.gz.Preprint/ (accessed on 3 March 2011). 

10. Mackay, D.J.C. Introduction to Gaussian Processes. In NATO ASI Series. Series F: Computer 
and System Sciences; Springer- Verlag: Heidelberg, Germany, 1998; pp. 133-165. 

11. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; The MIT Press: 
Cambridge, MA, USA, 2006. 

12. Krause, A.; Guestrin, C; Gupta, A.; Kleinberg, J. Near-optimal sensor placements: Maximizing 
information while minimizing communication cost. In Proceedings of the 5th International 
Symposium on Information Processing in Sensor Networks (IPSN), Nashville, TN, USA, April 
2006. 

13. Krause, A.; Singh, A.; Guestrin, C. Near-optimal sensor placements in Gaussian processes: 
Theory, efficient algorithms and empirical studies. J. Mach. Learn. Res. 2008, 9, 235-284. 

14. Cortes, J. Distributed Kriged Kalman filter for spatial estimation. IEEE Trans. Automat. Contr. 
2010,54, 2816-2827. 

15. Graham, R.; Cortes, J. Cooperative adaptive sampling of random fields with partially known 
covariance. Int. J. Robust Nonlinear Contr. 2009, 1, 1-2. 

16. Choi, J.; Lee, J.; Oh, S. Swarm intelligence for achieving the global maximum using 
spatio-temporal Gaussian processes. In Proceedings of the 27th American Control Conference 
(ACC), Seattle, WA, USA, June 2008. 

17. Choi, J.; Lee, J.; Oh, S. Biologically-inspired navigation strategies for swarm intelligence using 
spatial Gaussian processes. In Proceedings of the 17th International Federation of Automatic 
Control (IFAC) World Congress, Seoul, South Korea, July 2008. 

18. Nott, D.J.; Dunsmuir, W.T.M. Estimation of nonstationary spatial covariance structure. 
Biometrika 2002, 89, 819-829. 

19. Kay, S.M. Fundamentals of Statistical Signal Processing: Estimation Theory; Prentice Hall, Inc.: 
Upper Saddle River, NJ, USA, 1993. 

20. Lagarias, J.; Reeds, J.; Wright, M.; Wright, P. Convergence properties of the Nelder-Mead 
simplex method in low dimensions. SI AM J. Optimization 1999, 9, 112-147. 

21. Mandic, M.; Franzzoli, E. Efficient sensor coverage for acoustic localization. In Proceedings of 
the 46th IEEE Conference on Decision and Control, New Orleans, LA, USA, December 2007. 

22. Martinez, S.; Bullo, F. Optimal sensor placement and motion coordination for target tracking. 
Automatica 2006, 42, 661-668. 



Sensors 2011, 11 



3066 



23. Nagamune, R.; Choi, J. Parameter reduction in estimated model sets for robust control. J. Dynam. 
Syst. Meas. Contr. 2010, 132, 021002. 

24. Pukelsheimi, F. Optimal Design of Experiments; Wiley: New York, NY, USA, 1993. 

25. Emery, A.F.; Nenarokomov, A.V. Optimal experiment design. Meas. Sci. Technol. 1998, 
9, 864-76. 

26. Kathirgamanathan, P.; McKibbin, R. Source term estimation of pollution from an instantaneous 
point source. Res. Lett. Infor. Math. Sci. 2002, 3, 59-67. 

27. Christopoulos, V.N.; Roumeliotis, S. Adaptive sensing for instantaneous gas release parameter 
estimation. In Proceedings of the 2005 IEEE International Conference on Robotics and 
Automation, Barcelona, Spain, 2005. 

© 2011 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article 
distributed under the terms and conditions of the Creative Commons Attribution license 
(http://creativecommons.Org/licenses/by/3.0/.) 



